Unsupervised outlier detection in multidimensional data

Authors

Abstract

Detection and removal of outliers in a dataset is a fundamental preprocessing task without which the analysis of the data can be misleading. Furthermore, the existence of anomalies in the data can heavily degrade the performance of machine learning algorithms. In order to detect the anomalies in a dataset in an unsupervised manner, some novel statistical techniques are proposed in this paper. The proposed techniques are based on statistical methods considering data compactness and other properties. The newly proposed ideas are found efficient in terms of performance, ease of implementation, and computational complexity. Furthermore, two proposed techniques presented in this paper use transformation of data to a unidimensional distance space to detect the outliers, so irrespective of the data’s high dimensions, the techniques remain computationally inexpensive and feasible. Comprehensive performance analysis of the proposed anomaly detection schemes is presented in the paper, and the newly proposed schemes are found better than the state-of-the-art methods when tested on several benchmark datasets.
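The idea of mapping multidimensional data to a unidimensional distance space can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the centre, the distance, or the decision rule, so the coordinate-wise median, the Euclidean distance, and the Tukey upper fence (with the hypothetical parameter `k`) used here are all assumptions chosen for illustration. The function name `distance_space_outliers` is likewise hypothetical.

```python
import math
from statistics import median, quantiles


def distance_space_outliers(points, k=1.5):
    """Flag points whose distance to a robust centre is unusually large.

    Assumed choices (not taken from the paper): coordinate-wise median as
    the centre, Euclidean distance, and a Tukey upper fence as the rule.
    """
    # Robust centre: the median of each coordinate.
    centre = [median(column) for column in zip(*points)]
    # Map every d-dimensional point to a single number: its distance to the centre.
    dists = [math.dist(p, centre) for p in points]
    # Detect outliers in the 1-D distance space with an upper fence.
    q1, _, q3 = quantiles(dists, n=4)
    fence = q3 + k * (q3 - q1)
    return [d > fence for d in dists], dists


# Seven points near the origin and one point far away from them.
points = [
    (0.0, 0.0, 0.0), (0.1, -0.1, 0.0), (-0.1, 0.1, 0.1), (0.2, 0.0, -0.1),
    (0.0, 0.2, 0.1), (-0.2, -0.1, 0.0), (0.1, 0.1, -0.2), (8.0, 9.0, 7.0),
]
flags, dists = distance_space_outliers(points)
print([i for i, flagged in enumerate(flags) if flagged])  # → [7]
```

Whatever the number of dimensions, the detection step only sorts and thresholds one distance per point, which is why approaches of this kind stay inexpensive on high-dimensional data.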

Similar resources

Very Fast Outlier Detection in Large Multidimensional Data Sets

Outliers are objects that do not comply with the general behavior of the data. Applications such as exploration in science databases need fast interactive tools for outlier detection in data sets that have unknown distributions, are large in size, and are in high dimensional space. Existing algorithms for outlier detection are too slow for such applications. We present an algorithm based on an ...

Outlier Detection in Multivariate Data

The objective of this research is the detection of outliers in multivariate data employing various distance measures, particularly using a robust regression diagnosis technique. Several classical outlier identification methods are based on the sample mean and covariance matrix. But they do not always yield good results, as they are themselves affected by the outliers. Sometimes one outlier...

Outlier detection in astronomical data

Astronomical data sets have experienced an unprecedented and continuing growth in the volume, quality, and complexity over the past few years, driven by the advances in telescope, detector, and computer technology. Like many other fields, astronomy has become a very data rich science. Information content measured in multiple Terabytes, and even larger, multi Petabyte data sets are on the horizo...

Outlier Detection with Uncertain Data

In recent years, many new techniques have been developed for mining and managing uncertain data. This is because of the new ways of collecting data which has resulted in enormous amounts of inconsistent or missing data. Such data is often remodeled in the form of uncertain data. In this paper, we will examine the problem of outlier detection with uncertain data sets. The outlier detection probl...

Journal

Journal title: Journal of Big Data

Year: 2021

ISSN: 2196-1115

DOI: https://doi.org/10.1186/s40537-021-00469-z